What's Wrong with Automatic Speech Recognition (ASR) and How Can We Fix It?
Similar resources
Automatic Speech Recognition for Second Language Learning: How and Why It Actually Works
In this paper, we examine various studies and reviews on the usability of Automatic Speech Recognition (ASR) technology as a tool to train pronunciation in the second language (L2). We show that part of the criticism that has been addressed to this technology is not warranted, being rather the result of limited familiarity with ASR technology and with broader Computer Assisted Language Learning...
What's the difference? comparing humans and machines on the Aurora 2 speech recognition task
The comparison of human speech recognition (HSR) and machine performance makes it possible to learn from the differences between HSR and automatic speech recognition (ASR) and serves as motivation for using auditory-inspired strategies in ASR. The recognition of noisy digit strings from the Aurora 2 framework is one of the most widely used tasks in the ASR community. This paper establishes a baseline with...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features
Speech enhancement is an important front-end technique to improve automatic speech recognition (ASR) in noisy environments. However, incorrect noise suppression by speech enhancement often causes additional distortions in speech signals, which degrades ASR performance. To compensate for the distortions, ASR needs to consider the uncertainty of enhanced features, which can be achieved by using t...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Publication date: 2013